Smart Paper Notes

On Optimizing Interventions in Shared Autonomy

Weihao Tan, David Koleczek, Siddhant Pradhan, Nicholas Perello, Vivek Chettiar, Vishal Rohra, Aaslesha Rajaram, Soundararajan Srinivasan, H M Sajjad Hossain, Yash Chandak

Categories: Artificial Intelligence | Machine Learning

2021-12-16

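Shared autonomy refers to approaches that enable an autonomous agent to collaborate with a human in order to improve human performance. Beyond improving performance, however, it is often also beneficial for the agent to account for preserving the user's experience or satisfaction with the collaboration. To address this additional goal, we study approaches for improving the user experience by constraining the number of interventions made by the autonomous agent. We propose two model-free reinforcement learning methods that can account for both hard and soft constraints on the number of interventions. We show that our method not only outperforms the existing baseline but also eliminates the need to manually tune a black-box hyperparameter for controlling the level of assistance. We also provide an in-depth analysis of intervention scenarios to further illuminate system understanding.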
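
To make the idea of a hard constraint on interventions concrete, here is a minimal sketch. It is not the method proposed in the paper, which learns when to intervene; it only illustrates a fixed intervention budget. The function name `act_with_budget` and the `q_values` table are hypothetical and introduced purely for illustration.

```python
def act_with_budget(human_action, agent_action, q_values, budget_left):
    """Pick the action to execute under a hard intervention budget.

    q_values is a hypothetical dict mapping each action to an estimated value.
    Returns (executed_action, remaining_budget).
    """
    # Estimated benefit of overriding the human's choice.
    gain = q_values[agent_action] - q_values[human_action]
    # Intervene only if budget remains and the override is estimated to help.
    if budget_left > 0 and agent_action != human_action and gain > 0:
        return agent_action, budget_left - 1
    # Otherwise the human's action passes through unchanged.
    return human_action, budget_left


q = {"left": 1.0, "right": 0.2}
print(act_with_budget("right", "left", q, budget_left=1))
print(act_with_budget("right", "left", q, budget_left=0))
```

Once the budget reaches zero the human's action is always executed, which is what distinguishes a hard limit from a soft penalty on interventions.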

Translated by Google Translate

Multimodal Wildland Fire Smoke Detection

Siddhant Baldota, Shreyas Anantha Ramaprasad, Jaspreet Kaur Bhamra, Shane Luna, Ravi Ramachandra, Eugene Zen, Harrison Kim, Daniel Crawl, Ismael Perez, Ilkay Altintas

Category: Computer Vision

2022-12-29

Research has shown that climate change creates warmer temperatures and drier conditions, leading to longer wildfire seasons and increased wildfire risks in the United States. These factors have in turn led to increases in the frequency, extent, and severity of wildfires in recent years. Given the danger posed by wildland fires to people, property, wildlife, and the environment, there is an urgency to provide tools for effective wildfire management. Early detection of wildfires is essential to minimizing potentially catastrophic destruction. In this paper, we present our work on integrating multiple data sources in SmokeyNet, a deep learning model using spatio-temporal information to detect smoke from wildland fires. Camera image data is integrated with weather sensor measurements and processed by SmokeyNet to create a multimodal wildland fire smoke detection system. We present our results comparing performance in terms of both accuracy and time-to-detection for multimodal data vs. a single data source. With a time-to-detection of only a few minutes, SmokeyNet can serve as an automated early notification system, providing a useful tool in the fight against destructive wildfires.

SLUE Phase-2: A Benchmark Suite of Diverse Spoken Language Understanding Tasks

Suwon Shon, Siddhant Arora, Chyi-Jiunn Lin, Ankita Pasad, Felix Wu, Roshan Sharma, Wei-Lun Wu, Hung-Yi Lee, Karen Livescu, Shinji Watanabe

Category: Natural Language Processing

2022-12-20

Spoken language understanding (SLU) tasks have been studied for many decades in the speech research community, but have not received as much attention as lower-level tasks like speech and speaker recognition. In particular, there are not nearly as many SLU task benchmarks, and many of the existing ones use data that is not freely available to all researchers. Recent work has begun to introduce such benchmark datasets for several tasks. In this work, we introduce several new annotated SLU benchmark tasks based on freely available speech data, which complement existing benchmarks and address gaps in the SLU evaluation landscape. We contribute four tasks: question answering and summarization involve inference over longer speech sequences; named entity localization addresses the speech-specific task of locating the targeted content in the signal; dialog act classification identifies the function of a given speech utterance. We follow the blueprint of the Spoken Language Understanding Evaluation (SLUE) benchmark suite. In order to facilitate the development of SLU models that leverage the success of pre-trained speech representations, we will be publishing for each task (i) annotations for a relatively small fine-tuning set, (ii) annotated development and test sets, and (iii) baseline models for easy reproducibility and comparisons. In this work, we present the details of data collection and annotation and the performance of the baseline models. We also perform sensitivity analysis of pipeline models' performance (speech recognizer + text model) to the speech recognition accuracy, using more than 20 state-of-the-art speech recognition models.

Associations Between Natural Language Processing (NLP) Enriched Social Determinants of Health and Suicide Death among US Veterans

Avijit Mitra, Richeek Pradhan, Rachel D Melamed, Kun Chen, David C Hoaglin, Katherine L Tucker, Joel I Reisman, Zhichao Yang, Weisong Liu, Jack Tsai

Category: Natural Language Processing

2022-12-11

Importance: Social determinants of health (SDOH) are known to be associated with increased risk of suicidal behaviors, but few studies utilized SDOH from unstructured electronic health record (EHR) notes. Objective: To investigate associations between suicide and recent SDOH, identified using structured and unstructured data. Design: Nested case-control study. Setting: EHR data from the US Veterans Health Administration (VHA). Participants: 6,122,785 Veterans who received care in the US VHA between October 1, 2010, and September 30, 2015. Exposures: Occurrence of SDOH over a maximum span of two years compared with no occurrence of SDOH. Main Outcomes and Measures: Cases of suicide deaths were matched with 4 controls on birth year, cohort entry date, sex, and duration of follow-up. We developed an NLP system to extract SDOH from unstructured notes. Structured data, NLP on unstructured data, and combining them yielded seven, eight and nine SDOH respectively. Adjusted odds ratios (aORs) and 95% confidence intervals (CIs) were estimated using conditional logistic regression. Results: In our cohort, 8,821 Veterans committed suicide during 23,725,382 person-years of follow-up (incidence rate 37.18 /100,000 person-years). Our cohort was mostly male (92.23%) and white (76.99%). Across the six common SDOH as covariates, NLP-extracted SDOH, on average, covered 84.38% of all SDOH occurrences. All SDOH, measured by structured data and NLP, were significantly associated with increased risk of suicide. The SDOH with the largest effects was legal problems (aOR=2.67, 95% CI=2.46-2.89), followed by violence (aOR=2.26, 95% CI=2.11-2.43). NLP-extracted and structured SDOH were also associated with suicide. Conclusions and Relevance: NLP-extracted SDOH were always significantly associated with increased risk of suicide among Veterans, suggesting the potential of NLP in public health studies.
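
The incidence rate quoted in the abstract can be checked directly from the two counts it reports. The sketch below is plain arithmetic on those figures; the helper name `incidence_rate_per_100k` is our own and not from the study.

```python
def incidence_rate_per_100k(events: int, person_years: int) -> float:
    """Events per 100,000 person-years of follow-up."""
    return events / person_years * 100_000


# Figures taken from the abstract: 8,821 suicide deaths over
# 23,725,382 person-years of follow-up.
rate = incidence_rate_per_100k(8_821, 23_725_382)
print(round(rate, 2))  # 37.18
```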

Nostradamus: Weathering Worth

Alapan Chaudhuri, Zeeshan Ahmed, Ashwin Rao, Shivansh Subramanian, Shreyas Pradhan, Abhishek Mittal

Category: Machine Learning

2022-12-08

Nostradamus, inspired by the French astrologer and reputed seer, is a detailed study exploring relations between environmental factors and changes in the stock market. In this paper, we analyze associative correlation and causation between environmental elements and stock prices based on the US financial market, global climate trends, and daily weather records to demonstrate significant relationships between climate and stock price fluctuation. Our analysis covers short and long-term rises and dips in company stock performances. Lastly, we take four natural disasters as a case study to observe their effect on the emotional state of people and their influence on the stock market.

SSDNeRF: Semantic Soft Decomposition of Neural Radiance Fields

Siddhant Ranade, Christoph Lassner, Kai Li, Christian Haene, Shen-Chi Chen, Jean-Charles Bazin, Sofien Bouaziz

Category: Computer Vision

2022-12-07

Neural Radiance Fields (NeRFs) encode the radiance in a scene parameterized by the scene's plenoptic function. This is achieved by using an MLP together with a mapping to a higher-dimensional space, and has been proven to capture scenes with a great level of detail. Naturally, the same parameterization can be used to encode additional properties of the scene, beyond just its radiance. A particularly interesting property in this regard is the semantic decomposition of the scene. We introduce a novel technique for semantic soft decomposition of neural radiance fields (named SSDNeRF) which jointly encodes semantic signals in combination with radiance signals of a scene. Our approach provides a soft decomposition of the scene into semantic parts, enabling us to correctly encode multiple semantic classes blending along the same direction -- an impossible feat for existing methods. Not only does this lead to a detailed, 3D semantic representation of the scene, but we also show that the regularizing effects of the MLP used for encoding help to improve the semantic representation. We show state-of-the-art segmentation and reconstruction results on a dataset of common objects and demonstrate how the proposed approach can be applied for high quality temporally consistent video editing and re-compositing on a dataset of casually captured selfie videos.

What do you MEME? Generating Explanations for Visual Semantic Role Labelling in Memes

Shivam Sharma, Siddhant Agarwal, Tharun Suresh, Preslav Nakov, Md. Shad Akhtar, Tanmoy Chakraborty

Category: Natural Language Processing

2022-12-01

Memes are powerful means for effective communication on social media. Their effortless amalgamation of viral visuals and compelling messages can have far-reaching implications with proper marketing. Previous research on memes has primarily focused on characterizing their affective spectrum and detecting whether the meme's message insinuates any intended harm, such as hate, offense, racism, etc. However, memes often use abstraction, which can be elusive. Here, we introduce a novel task - EXCLAIM, generating explanations for visual semantic role labeling in memes. To this end, we curate ExHVV, a novel dataset that offers natural language explanations of connotative roles for three types of entities - heroes, villains, and victims, encompassing 4,680 entities present in 3K memes. We also benchmark ExHVV with several strong unimodal and multimodal baselines. Moreover, we posit LUMEN, a novel multimodal, multi-task learning framework that endeavors to address EXCLAIM optimally by jointly learning to predict the correct semantic roles and correspondingly to generate suitable natural language explanations. LUMEN distinctly outperforms the best baseline across 18 standard natural language generation evaluation metrics. Our systematic evaluation and analyses demonstrate that characteristic multimodal cues required for adjudicating semantic roles are also helpful for generating suitable explanations.

Reinforcement Learning Methods for Wordle: A POMDP/Adaptive Control Approach

Siddhant Bhambri, Amrita Bhattacharjee, Dimitri Bertsekas

Category: Artificial Intelligence

2022-11-15

In this paper we address the solution of the popular Wordle puzzle, using new reinforcement learning methods, which apply more generally to adaptive control of dynamic systems and to classes of Partially Observable Markov Decision Process (POMDP) problems. These methods are based on approximation in value space and the rollout approach, admit a straightforward implementation, and provide improved performance over various heuristic approaches. For the Wordle puzzle, they yield on-line solution strategies that are very close to optimal at relatively modest computational cost. Our methods are viable for more complex versions of Wordle and related search problems, for which an optimal strategy would be impossible to compute. They are also applicable to a wide range of adaptive sequential decision problems that involve an unknown or frequently changing environment whose parameters are estimated on-line.
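
For readers unfamiliar with how Wordle fits a partially observable setting, the sketch below shows the game's feedback rule and how each observation shrinks the set of words still consistent with it. It then picks a guess with a simple one-step lookahead heuristic. This is not the authors' rollout algorithm; it is the kind of basic heuristic such methods might improve upon. All function names and the tiny word list are illustrative assumptions.

```python
from collections import Counter


def feedback(guess: str, answer: str) -> str:
    """Wordle feedback string: G = green, Y = yellow, B = gray."""
    marks = ["B"] * len(guess)
    remaining = Counter()
    # First pass: mark exact matches and count unmatched answer letters.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            marks[i] = "G"
        else:
            remaining[a] += 1
    # Second pass: mark misplaced letters while unmatched copies remain.
    for i, g in enumerate(guess):
        if marks[i] == "B" and remaining[g] > 0:
            marks[i] = "Y"
            remaining[g] -= 1
    return "".join(marks)


def filter_candidates(candidates, guess, observed):
    """Keep only the words that would have produced the observed feedback."""
    return [w for w in candidates if feedback(guess, w) == observed]


def greedy_guess(candidates):
    """One-step lookahead: minimize the expected number of remaining words."""
    def expected_remaining(guess):
        buckets = Counter(feedback(guess, w) for w in candidates)
        return sum(n * n for n in buckets.values()) / len(candidates)
    return min(candidates, key=expected_remaining)


words = ["crate", "crane", "slate"]  # illustrative word list
print(feedback("speed", "abide"))
print(filter_candidates(words, "crane", "GGGBG"))
print(greedy_guess(words))
```

The two-pass structure in `feedback` matters for repeated letters: a letter is only marked yellow while unmatched copies of it remain in the answer.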

A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding

Yifan Peng, Siddhant Arora, Yosuke Higuchi, Yushi Ueda, Sujay Kumar, Karthik Ganesan, Siddharth Dalmia, Xuankai Chang, Shinji Watanabe

Category: Natural Language Processing

2022-11-10

Collecting sufficient labeled data for spoken language understanding (SLU) is expensive and time-consuming. Recent studies achieved promising results by using pre-trained models in low-resource scenarios. Inspired by this, we aim to ask: which (if any) pre-training strategies can improve performance across SLU benchmarks? To answer this question, we employ four types of pre-trained models and their combinations for SLU. We leverage self-supervised speech and language models (LM) pre-trained on large quantities of unpaired data to extract strong speech and text representations. We also explore using supervised models pre-trained on larger external automatic speech recognition (ASR) or SLU corpora. We conduct extensive experiments on the SLU Evaluation (SLUE) benchmark and observe self-supervised pre-trained models to be more powerful, with pre-trained LM and speech models being most beneficial for the Sentiment Analysis and Named Entity Recognition tasks, respectively.

Cementron: Machine Learning the Constituent Phases in Cement Clinker from Optical Images

Mohd Zaki, Siddhant Sharma, Sunil Kumar Gurjar, Raju Goyal, Jayadeva, N. M. Anoop Krishnan

Category: Computer Vision

2022-11-06

Cement is the most used construction material. The performance of cement hydrate depends, both qualitatively and quantitatively, on the constituent phases present in the cement clinker, viz. alite, belite, aluminate, and ferrites. Traditionally, clinker phases are analyzed from optical images relying on a domain expert and simple image processing techniques. However, the non-uniformity of the images, variations in the geometry and size of the phases, and variabilities in the experimental approaches and imaging methods make it challenging to obtain the phases. Here, we present a machine learning (ML) approach to detect clinker microstructure phases automatically. To this end, we create the first annotated dataset of cement clinker by segmenting alite and belite particles. Further, we use supervised ML methods to train models for identifying alite and belite regions. Specifically, we finetune the image detection and segmentation model Detectron-2 on the cement microstructure to develop a model for detecting the cement phases, namely, Cementron. We show that Cementron, trained only on literature data, works remarkably well on new images obtained from our experiments, demonstrating its generalizability. We make Cementron available for public use.
